Empirically Sampling Universal Dependencies
Abstract
Developing parsing systems on the full Universal Dependencies (UD) collection in an unbiased way is computationally expensive. We propose a small subset of UD languages, chosen entirely empirically, for efficient parsing system development. The selection technique is based on global measurements of model capacity. We show that the resulting representative language set is more diverse than the one produced by the requirements-based procedure.
Similar resources
An annotation scheme for Persian based on Autonomous Phrases Theory and Universal Dependencies
A treebank is a corpus with linguistic annotations above the level of the parts of speech. During the first half of the present decade, three treebanks have been developed for Persian either originally or subsequently based on dependency grammar: Persian Treebank (PerTreeBank), Persian Syntactic Dependency Treebank, and Uppsala Persian Dependency Treebank (UPDT). The syntactic analysis of a sen...
Conversion from Paninian Karakas to Universal Dependencies for Hindi Dependency Treebank
Universal Dependencies (UD) have recently gained much attention for the systematic evaluation of cross-lingual dependency parsing techniques. In this paper we present our work in line with UD. Our contribution is manifold. We extend UD to Indian languages through conversion of Pāṇinian Dependencies to UD for the Hindi Dependency Treebank (HDTB). We discuss the differences in...
Binding Phenomena within a Reductionist Theory of Grammatical Dependencies
Title of dissertation: BINDING PHENOMENA WITHIN A REDUCTIONIST THEORY OF GRAMMATICAL DEPENDENCIES. Alex Drummond, Doctor of Philosophy, 2011. Dissertation directed by Norbert Hornstein, Department of Linguistics. This thesis investigates the implications of binding phenomena for the development of a reductionist theory of grammatical dependencies. The starting point is the analysis of binding and c...
Universal Decompositional Semantics on Universal Dependencies
We present a framework for augmenting data sets from the Universal Dependencies project with Universal Decompositional Semantics. Where the Universal Dependencies project aims to provide a syntactic annotation standard that can be used consistently across many languages as well as a collection of corpora that use that standard, our extension has similar aims for semantic annotation. We describe...
A Hybrid Approach for Probabilistic Inference using Random Projections
We introduce a new meta-algorithm for probabilistic inference in graphical models based on random projections. The key idea is to use approximate inference algorithms for an (exponentially) large number of samples, obtained by randomly projecting the original statistical model using universal hash functions. In the case where the approximate inference algorithm is a variational approximation, t...